52 research outputs found

Introduction to Machine Learning and Bioinformatics

Author: Markus Schmidberger

Research Papers in Economics

Introduction to Machine Learning and Bioinformatics

Author: Schmidberger Markus
Publication venue: 'Foundation for Open Access Statistics'
Publication date: 01/01/2008

Abstracts not available for BookReview

Open Access LMU

Journal of Statistical Software

[Review of] Introduction to Machine Learning and Bioinformatics. Sushmita Mitra, Sujay Datta, Theodore Perkins, and George Michailidis. - Chapman & Hall/CRC, Boca Raton, Florida, 2008

Author: Schmidberger Markus
Publication venue: 'Foundation for Open Access Statistics'
Publication date: 01/01/2008

Parallel Computing for Biological Data

Author: Schmidberger Markus
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 18/11/2009

In the 1990s a number of technological innovations appeared that revolutionized biology, and 'Bioinformatics' became a new scientific discipline. Microarrays can measure the abundance of tens of thousands of mRNA species, data on the complete genomic sequences of many different organisms are available, and other technologies make it possible to study various processes at the molecular level. In Bioinformatics and Biostatistics, current research and computations are limited by the available computer hardware. However, this problem can be solved using high-performance computing resources. There are several reasons for the increased focus on high-performance computing: larger data sets, increased computational requirements stemming from more sophisticated methodologies, and the latest developments in computer chip production. The open-source programming language 'R' was developed to provide a powerful and extensible environment for statistical and graphical techniques. There are many good reasons for preferring R to other software or programming languages for scientific computations (in statistics and biology). However, the development of the R language was not aimed at providing software for parallel or high-performance computing. Nonetheless, during the last decade, a great deal of research has been conducted on using parallel computing techniques with R. This PhD thesis demonstrates the usefulness of the R language and parallel computing for biological research. It introduces parallel computing with R, and reviews and evaluates existing techniques and R packages for parallel computing on computer clusters, on multi-core systems, and in grid computing. From a computer science point of view, the packages were examined as to their reusability in biological applications, and some upgrades were proposed. Furthermore, parallel applications for next-generation sequence data and preprocessing of microarray data were developed.
Microarray data are characterized by high levels of noise and bias. As these perturbations have to be removed, preprocessing of raw data has been a research topic of high priority over the past few years. A new Bioconductor package called affyPara for parallelized preprocessing of high-density oligonucleotide microarray data was developed and published. The partition of data can be performed on arrays using a block cyclic partition, and, as a result, parallelization of algorithms becomes directly possible. Existing statistical algorithms and data structures had to be adjusted and reformulated for use in parallel computing. Using the new parallel infrastructure, normalization methods can be enhanced and new methods become available. The partition of data and distribution to several nodes or processors solves the main memory problem and accelerates the methods by up to a factor of fifteen for 300 arrays or more. The final part of the thesis contains a large cancer study analysing more than 7000 microarrays from a publicly available database, and estimating gene interaction networks. For this purpose, a new R package for microarray data management was developed, and various challenges regarding the analysis of this amount of data are discussed. The comparison of gene networks for different pathways and different cancer entities in this large data set partly confirms already established forms of gene interaction.
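The block cyclic partition of arrays mentioned in the abstract can be illustrated with a minimal sketch. It is written in Python rather than R and is not taken from affyPara; the function name `block_cyclic_partition`, the `block_size` parameter, and the file names are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def block_cyclic_partition(
    items: Sequence[T], n_workers: int, block_size: int = 1
) -> List[List[T]]:
    """Deal consecutive blocks of `block_size` items to workers in round-robin order.

    Hypothetical helper for illustration; not the affyPara implementation.
    """
    if n_workers < 1 or block_size < 1:
        raise ValueError("n_workers and block_size must be positive")
    parts: List[List[T]] = [[] for _ in range(n_workers)]
    for start in range(0, len(items), block_size):
        # Block number modulo worker count gives the cyclic assignment.
        worker = (start // block_size) % n_workers
        parts[worker].extend(items[start:start + block_size])
    return parts


# Ten made-up array file names distributed over three workers, two per block.
arrays = [f"array_{i:02d}.CEL" for i in range(10)]
partition = block_cyclic_partition(arrays, n_workers=3, block_size=2)
for worker_id, chunk in enumerate(partition):
    print(worker_id, chunk)
```

Each worker then holds only its own share of the arrays, which is what relieves the main-memory pressure described above; any statistic that needs all arrays has to be combined from per-worker partial results.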

Digitale Hochschulschriften der LMU

affyPara—a Bioconductor Package for Parallelized Preprocessing Algorithms of Affymetrix Microarray Data

Author: Mansmann Ulrich
Schmidberger Markus
Vicedo Esmeralda
Publication venue: Libertas Academica
Publication date: 01/01/2009

Microarray data repositories and large clinical applications of gene expression make it possible to analyse several hundred microarrays at one time. Preprocessing large numbers of microarrays is still a challenge. The algorithms are limited by the available computer hardware. For example, building classification or prognostic rules from large microarray sets is very time-consuming. Here, preprocessing has to be part of the cross-validation and resampling strategy, which is necessary to estimate the rule’s prediction quality honestly.
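The point that preprocessing must sit inside the cross-validation loop can be sketched as follows. This is a toy Python illustration under stated assumptions: per-feature centring and scaling stands in for real microarray preprocessing, a nearest-centroid rule stands in for the classification rule, and all function names and data values are made up.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List, Sequence

Sample = List[float]


def fit_scaler(train: Sequence[Sample]) -> Callable[[Sample], Sample]:
    """Estimate per-feature centre and scale from the training samples only."""
    columns = list(zip(*train))
    centres = [mean(c) for c in columns]
    scales = [pstdev(c) or 1.0 for c in columns]  # guard against zero spread
    return lambda x: [(v - m) / s for v, m, s in zip(x, centres, scales)]


def fit_nearest_centroid(
    train: Sequence[Sample], labels: Sequence[str]
) -> Callable[[Sample], str]:
    """Toy rule: predict the label of the closest class centroid."""
    centroids: Dict[str, Sample] = {}
    for lab in set(labels):
        members = [x for x, l in zip(train, labels) if l == lab]
        centroids[lab] = [mean(c) for c in zip(*members)]

    def predict(x: Sample) -> str:
        return min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
        )

    return predict


def cross_validated_accuracy(
    samples: Sequence[Sample], labels: Sequence[str], k: int
) -> float:
    """k-fold accuracy with preprocessing re-estimated inside every fold."""
    n = len(samples)
    correct = 0
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        # The scaler never sees the held-out samples of this fold.
        scale = fit_scaler([samples[i] for i in train_idx])
        rule = fit_nearest_centroid(
            [scale(samples[i]) for i in train_idx],
            [labels[i] for i in train_idx],
        )
        correct += sum(rule(scale(samples[i])) == labels[i] for i in test_idx)
    return correct / n


# Made-up, well-separated two-class data.
samples = [[0.0, 1.0], [10.0, 9.0], [0.5, 1.5], [9.5, 8.5], [1.0, 0.5], [10.5, 9.5]]
labels = ["a", "b", "a", "b", "a", "b"]
accuracy = cross_validated_accuracy(samples, labels, k=3)
print(accuracy)
```

Because the preprocessing step is repeated once per fold, its cost multiplies with the number of folds and resamples, which is why a fast, parallel preprocessing step matters for this kind of analysis.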

CiteSeerX

Directory of Open Access Journals

PubMed Central

State-of-the-Art in Parallel Computing with R

Author: Eddelbuettel Dirk
Mansmann Ulrich
Morgan Martin
Schmidberger Markus
Tierney Luke
Yu Hao
Publication date: 01/01/2009

R is a mature open-source programming language for statistical computing and graphics. Many areas of statistical research are experiencing rapid growth in the size of data sets. Methodological advances drive increased use of simulations. A common approach is to use parallel computing. This paper presents an overview of techniques for parallel computing with R on computer clusters, on multi-core systems, and in grid computing. It reviews sixteen different packages, comparing them on their state of development, the parallel technology used, as well as on usability, acceptance, and performance. Two packages (snow, Rmpi) stand out as particularly useful for general use on computer clusters. Packages for grid computing are still in development, with only one package currently available to the end user. For multi-core systems four different packages exist, but a number of issues pose challenges to early adopters. The paper concludes with ideas for further developments in high performance computing with R. Example code is available in the appendix

Crossref

Directory of Open Access Journals

Open Access LMU

Journal of Statistical Software

State of the Art in Parallel Computing with R

Author: Dirk Eddelbuettel
Hao Yu
Luke Tierney
Markus Schmidberger
Martin Morgan
Ulrich Mansmann

R is a mature open-source programming language for statistical computing and graphics. Many areas of statistical research are experiencing rapid growth in the size of data sets. Methodological advances drive increased use of simulations. A common approach is to use parallel computing. This paper presents an overview of techniques for parallel computing with R on computer clusters, on multi-core systems, and in grid computing. It reviews sixteen different packages, comparing them on their state of development, the parallel technology used, as well as on usability, acceptance, and performance. Two packages (snow, Rmpi) stand out as particularly suited to general use on computer clusters. Packages for grid computing are still in development, with only one package currently available to the end user. For multi-core systems five different packages exist, but a number of issues pose challenges to early adopters. The paper concludes with ideas for further developments in high performance computing with R. Example code is available in the appendix.

Research Papers in Economics

Determination of cell survival after irradiation via clonogenic assay versus multiple MTT Assay - A comparative study

Author: Buch Karl
Langguth Peter
Nawroth Thomas
Peters Tanja
Schmidberger Heinz
Sänger Markus
Publication venue: BioMed Central
Publication date: 01/01/2012

To study proliferation and determine the survival of cancer cells after irradiation, the multiple MTT assay, which is based on the reduction of a yellow, water-soluble tetrazolium salt to a purple, water-insoluble formazan dye by living cells, was modified from a single-point assay into a proliferation assay. The assay can be performed on a large number of samples in a short time using multi-well plates, and semi-automatically with a microplate reader. Survival, the calculated parameter in this assay, is determined mathematically. Exponential growth in both control and irradiated groups was proven as the underlying basis of the applicability of the multiple MTT assay. Its equivalence to a clonogenic survival assay, which has disadvantages such as time consumption, was proven in two setups, including plating of cells before and after irradiation. Three cell lines (A 549, LN 229 and F 98) were included in the experiment to study its principal and general applicability.
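The abstract does not state how survival is calculated. One way such a calculation could work, shown here purely as an assumption and not as the paper's formula, follows from exponential growth: if control and irradiated cultures share a doubling time and the irradiated culture reaches the same signal level after a delay, the surviving fraction is 2 raised to the power of minus the delay divided by the doubling time. The function names and readings below are hypothetical.

```python
import math


def doubling_time(t1: float, n1: float, t2: float, n2: float) -> float:
    """Doubling time of an exponentially growing population from two readings.

    t1, t2 are time points (e.g. hours); n1, n2 are the signal at those times.
    """
    return (t2 - t1) * math.log(2) / math.log(n2 / n1)


def surviving_fraction(delay: float, t_double: float) -> float:
    """Surviving fraction implied by a growth delay, assuming exponential growth
    with the same doubling time in control and irradiated groups."""
    return 2.0 ** (-delay / t_double)


# Made-up readings: control signal rises from 0.2 to 0.8 within 48 hours.
t_d = doubling_time(0.0, 0.2, 48.0, 0.8)
# Assumed observation: irradiated wells reach the same signal 48 hours later.
s = surviving_fraction(48.0, t_d)
print(round(t_d, 3), round(s, 3))
```

With these invented numbers the doubling time is about 24 hours and the delay of two doubling times corresponds to roughly one quarter of the cells surviving.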

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Gutenberg Open

Prospective, open, multi-centre phase I/II trial to assess safety and efficacy of neoadjuvant radiochemotherapy with docetaxel and oxaliplatin in patients with adenocarcinoma of the oesophagogastric junction

Author: Arnold Dirk
Brenner Baruch
Galle Peter R.
Gockel Ines
Klautke Gunther
Lang Hauke
Möhler Markus
Roessler Hans-Peter
Rödel Claus
Schimanski Carl Christoph
Schmidberger Heinz
Thomaidis Thomas
Trarbach Tanja
Publication date: 11/02/2013

Background: This phase I/II-trial assessed the dose-limiting toxicities (DLT) and maximum tolerated dose (MTD) of neoadjuvant radiochemotherapy (RCT) with docetaxel and oxaliplatin in patients with locally advanced adenocarcinoma of the oesophagogastric junction. Methods: Patients received neoadjuvant radiotherapy (50.4 Gy) together with weekly docetaxel (20 mg/m2 at dose level (DL) 1 and 2, 25 mg/m2 at DL 3) and oxaliplatin (40 mg/m2 at DL 1, 50 mg/m2 at DL 2 and 3) over 5 weeks. The primary endpoint was the DLT and the MTD of the RCT regimen. Secondary endpoints included overall response rate (ORR) and progression-free survival (PFS). Results: A total of 24 patients were included. Four patients were treated at DL 1, 13 patients at DL 2 and 7 patients at DL 3. The MTD of the RCT was considered DL 2 with docetaxel 20 mg/m2 and oxaliplatin 50 mg/m2. Objective response (CR/PR) was observed in 32% (7/22) of patients. Eighteen patients (75%) underwent surgery after RCT. The median PFS for all patients (n = 24) was 6.5 months. The median overall survival for all patients (n = 24) was 16.3 months. Patients treated at DL 2 had a median overall survival of 29.5 months. Conclusion: Neoadjuvant RCT with docetaxel 20 mg/m2 and oxaliplatin 50 mg/m2 was effective and showed a good toxicity profile. Future studies should consider the addition of targeted therapies to current neoadjuvant therapy regimens to further improve the outcome of patients with advanced cancer of the oesophagogastric junction. Trial Registration: NCT0037498

PubMed Central

Hochschulschriftenserver - Universität Frankfurt am Main

Conceptual Aspects of Large Meta-Analyses with Publicly Available Microarray Data: A Case Study in Oncology

Author: Brandt A.
Gentleman Robert C
Kalisch Markus
Landgren O.
Ludwig Arnold
Mansmann Ulrich
Onureena Banerjee
R development core team.
Schaefer Juliane
Schmidberger Markus
Villers Fanny
Wainwright Martin J
Publication venue: Libertas Academica
Publication date: 01/01/2011

Large public repositories of microarray experiments offer an abundance of biological data. It is of interest to use and combine the available material to create new biological information and to develop a broader view of biological phenomena.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central